Primary exercises

In the survey dataset:

  1. Select teenagers, assuming an age range from 10 to 19 inclusive (that is, 10 <= age < 20).
filter(survey, 10<=age & age<20)
# A tibble: 169 x 13
   name    gender span1 span2 hand  fold  pulse clap    exercise smokes height m.i        age
   <chr>   <chr>  <dbl> <dbl> <chr> <chr> <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>
 1 Alyson  female  18.5  18   right right    92 left    some     never    173  metric    18.2
 2 Todd    male    19.5  20.5 left  right   104 left    none     regul    178. imperial  17.6
 3 Gerald  male    18    13.3 right left     87 neither none     occas     NA  <NA>      16.9
 4 Andre   male    17.7  17.7 right left     83 right   freq     never    183. imperial  18.8
 5 Edward  male    20    19.5 right right    72 right   some     never    175  metric    19  
 6 Alfred  male    21    21   right right    68 left    freq     never     NA  <NA>      18.2
 7 Bernice female  16    16   right left     NA right   some     never    155  metric    18.8
 8 Velma   female  19.5  20.2 right left     66 neither some     never    155  metric    17.5
 9 Eddie   male    16    15.5 right right    60 right   some     never     NA  <NA>      17.2
10 Fern    female  17.5  17   right right    NA right   freq     never    156  metric    17.2
# … with 159 more rows
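The condition is written as `age < 20` rather than `age <= 19` because ages are stored as decimals (for example 18.8). The following is a minimal sketch on a made-up tibble (`toy` and its values are hypothetical, not the survey data) that contrasts the half-open interval with dplyr's `between()`, which includes both endpoints.

```r
library(dplyr)

# Hypothetical toy data, not the real survey
toy <- tibble(name = c("A", "B", "C", "D"),
              age  = c(9.5, 17.2, 19.9, 20.0))

# Half-open interval [10, 20): keeps ages 17.2 and 19.9
teens <- filter(toy, 10 <= age & age < 20)

# between() is inclusive on both ends, so between(age, 10, 19) drops 19.9
teens_closed <- filter(toy, between(age, 10, 19))

nrow(teens)         # 2
nrow(teens_closed)  # 1
```

With decimal ages, the half-open form is the one that matches "teenager" as usually understood.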
  2. Select all females with a pulse equal to 60.
filter(survey, pulse==60 & gender=="female")
# A tibble: 4 x 13
  name    gender span1 span2 hand  fold  pulse clap    exercise smokes height m.i        age
  <chr>   <chr>  <dbl> <dbl> <chr> <chr> <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>
1 Elnora  female  18    17.6 right right    60 right   some     occas    168  metric    18.4
2 Lavonne female  17.5  17.5 right right    60 right   freq     never    166. metric    23.2
3 Dianna  female  16    15.5 right left     60 left    freq     never    163. imperial  17.4
4 Patrica female  16.5  16.9 right right    60 neither freq     occas    169. metric    29.1
  3. Select all male teenagers with a pulse above 60.
filter(survey, pulse>60 & gender=="male" & (10<=age & age<20) )
# A tibble: 59 x 13
   name    gender span1 span2 hand  fold  pulse clap    exercise smokes height m.i        age
   <chr>   <chr>  <dbl> <dbl> <chr> <chr> <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>
 1 Todd    male    19.5  20.5 left  right   104 left    none     regul    178. imperial  17.6
 2 Gerald  male    18    13.3 right left     87 neither none     occas     NA  <NA>      16.9
 3 Andre   male    17.7  17.7 right left     83 right   freq     never    183. imperial  18.8
 4 Edward  male    20    19.5 right right    72 right   some     never    175  metric    19  
 5 Alfred  male    21    21   right right    68 left    freq     never     NA  <NA>      18.2
 6 Virgil  male    19.4  19.2 left  right    74 right   some     never    183. imperial  18.3
 7 Richard male    21    20.9 right right    78 right   freq     never    177  metric    17.9
 8 Virgil  male    21.5  22   right right    72 left    freq     never    190. imperial  17.9
 9 Troy    male    20.1  20.7 right left     72 right   freq     never    180. imperial  18.2
10 Charlie male    18.5  18   right left     64 right   freq     never    180. imperial  17.8
# … with 49 more rows
  4. How many males smoke and never exercise?
# There are two conditions: 'do smoke' and 'never exercise'. 'Do smoke' covers the categories {regul, occas, heavy}, so these are combined with the 'or' (|) logical operator. 'Never exercise' corresponds to the category 'none'. Both conditions must be true, so they are joined with the 'and' (&) logical operator.

nrow( filter(survey, (smokes=="regul" | smokes=="occas" | smokes=="heavy") & exercise=="none" & gender=="male" ) )
[1] 4
# Alternatively, a shorter condition for 'smokes' is smokes!="never". It accepts every value of 'smokes' other than "never", and those values are exactly {regul, occas, heavy}.

nrow( filter(survey, smokes!="never" & exercise=="none" & gender=="male" ) )
[1] 4
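The shortcut is only equivalent if `smokes` contains no categories beyond those four. The sketch below, on a hypothetical vector of values (not the survey data), shows one way to check the categories with `count()` and confirms that both conditions keep the same rows. Rows where `smokes` is missing are dropped by `filter()` in both cases, because the condition evaluates to `NA`.

```r
library(dplyr)

# Hypothetical toy data with one missing value
toy <- tibble(smokes = c("never", "regul", "occas", "heavy", NA, "never"))

# List the distinct categories and how often each occurs
count(toy, smokes)

# Explicit 'or' chain versus the shorter 'not equal' condition
by_or  <- filter(toy, smokes == "regul" | smokes == "occas" | smokes == "heavy")
by_neq <- filter(toy, smokes != "never")

identical(by_or$smokes, by_neq$smokes)  # TRUE
```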
  5. How many females never smoke and frequently exercise?
nrow( filter(survey, smokes=="never" & exercise=="freq" & gender=="female") )
[1] 38
  6. Reproduce the following tibbles:

6.1) Personal information {Name,Age,Gender,Height} of all teenagers.

filter(select(survey, Name=name, Age=age, Gender=gender, Height=height ), 10<=Age & Age<20)
# A tibble: 169 x 4
   Name      Age Gender Height
   <chr>   <dbl> <chr>   <dbl>
 1 Alyson   18.2 female   173 
 2 Todd     17.6 male     178.
 3 Gerald   16.9 male      NA 
 4 Andre    18.8 male     183.
 5 Edward   19   male     175 
 6 Alfred   18.2 male      NA 
 7 Bernice  18.8 female   155 
 8 Velma    17.5 female   155 
 9 Eddie    17.2 male      NA 
10 Fern     17.2 female   156 
# … with 159 more rows

6.2) Personal information of males with Height between 170 and 180, inclusive.

filter(select(survey, Name=name, Age=age, Gender=gender, Height=height),
        170<=Height & Height<=180 & Gender=="male")
# A tibble: 45 x 4
   Name      Age Gender Height
   <chr>   <dbl> <chr>   <dbl>
 1 Todd     17.6 male     178.
 2 Edward   19   male     175 
 3 Richard  17.9 male     177 
 4 Joe      17.5 male     173.
 5 Floyd    18.1 male     175.
 6 Russell  17.5 male     180 
 7 George   17.2 male     180 
 8 Mathew   19.9 male     171 
 9 Willard  18.9 male     180 
10 Andrew   19.4 male     170 
# … with 35 more rows
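The nested call `filter(select(...), ...)` reads inside-out. As an alternative sketch, the same two steps can be chained with the `%>%` pipe that dplyr re-exports, so they read in the order they run. The tibble `toy` below is hypothetical and only stands in for the survey data.

```r
library(dplyr)

# Hypothetical toy data, not the real survey
toy <- tibble(name   = c("A", "B", "C", "D"),
              age    = c(18, 19, 21, 17),
              gender = c("male", "male", "female", "male"),
              height = c(170, 185, 175, NA))

result <- toy %>%
  # Step 1: keep and rename the personal-information columns
  select(Name = name, Age = age, Gender = gender, Height = height) %>%
  # Step 2: filter on the renamed columns; rows with NA height are dropped
  filter(170 <= Height & Height <= 180 & Gender == "male")

result
```

Because `select()` runs first, the filter condition must use the new column names (`Height`, `Gender`), just as in the nested version.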

Extra exercises

  1. What is the percentage of males who never smoke and frequently exercise? Do the same for females.
# men (the result is a proportion; multiply by 100 to get a percentage)
none_smoker_sportsmen <- nrow( filter(survey, smokes=="never"  & 
                                              exercise=="freq" & 
                                              gender=="male") )
total_men <- nrow( filter(survey, gender=="male") ) 
none_smoker_sportsmen / total_men
[1] 0.4051724
# women
none_smoker_sportswomen <- nrow( filter(survey, smokes=="never"  & 
                                                exercise=="freq" & 
                                                gender=="female") )
total_women <- nrow( filter(survey, gender=="female") ) 
none_smoker_sportswomen / total_women
[1] 0.3247863
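Both proportions can also be computed in one step. The sketch below is an alternative to the solution above, not a replacement: it groups a hypothetical tibble (`toy`, not the survey data) by gender and takes the mean of the logical condition, scaled to a percentage. One caveat: if `smokes` or `exercise` contains missing values, `mean()` can return `NA`, whereas the `nrow(filter(...))` approach simply does not count those rows.

```r
library(dplyr)

# Hypothetical toy data: 4 males, 2 females, no missing values
toy <- tibble(gender   = c("male", "male", "male", "male", "female", "female"),
              smokes   = c("never", "never", "regul", "occas", "never", "occas"),
              exercise = c("freq", "some", "freq", "none", "freq", "freq"))

pct <- toy %>%
  group_by(gender) %>%
  # The mean of a logical vector is the share of TRUE values
  summarise(percent = 100 * mean(smokes == "never" & exercise == "freq"))

pct  # female: 50, male: 25
```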
  2. What is the age range among teenagers? You may use the range function (?range).
teenagers <- filter(survey, 10<=age & age<20) 
range(teenagers[["age"]])
[1] 16.750 19.917
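Here `range()` works directly because filtering on `age` has already removed rows with a missing age. On a vector that still contains `NA`, the `na.rm` argument is needed, as this small sketch with made-up values shows.

```r
# Hypothetical ages with one missing value
ages <- c(17.2, NA, 19.9, 16.8)

range(ages)                     # NA NA

# Drop missing values before taking the minimum and maximum
r <- range(ages, na.rm = TRUE)
r                               # 16.8 19.9
```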
  3. How many males smoke and never exercise? Use the ‘%in%’ operator; see ?match for more details.
nrow( filter(survey, (smokes %in% c("regul","occas", "heavy")) & exercise=="none" & gender=="male" ) )
[1] 4
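To illustrate how `%in%` differs from a chain of `==` comparisons, the sketch below applies both to a hypothetical character vector. `%in%` always returns `TRUE` or `FALSE`, while `==` propagates missing values as `NA`. Inside `filter()` both forms drop the row with the missing value, so the counts agree.

```r
# Hypothetical values, including one missing entry
smokes <- c("never", "regul", NA, "heavy")

# Membership test: NA is simply not in the set, so the result is FALSE
in_set <- smokes %in% c("regul", "occas", "heavy")
in_set   # FALSE TRUE FALSE TRUE

# Chained comparison: the NA entry yields NA instead of FALSE
by_eq <- smokes == "regul" | smokes == "occas" | smokes == "heavy"
by_eq    # FALSE TRUE NA TRUE
```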


Copyright © 2021 Biomedical Data Sciences (BDS) | LUMC